Similar resources
Indexing Compressed Text
As a result of the rapid growth of the volume of electronic data, text compression and indexing techniques are receiving more and more attention. These two issues are usually treated as independent problems, but approaches of combining them have recently attracted the attention of researchers. In this thesis, we review and test some of the more effective and some of the more theoretically inter...
Universal Compressed Text Indexing
The rise of repetitive datasets has lately generated a lot of interest in compressed self-indexes based on dictionary compression, a rich and heterogeneous family that exploits text repetitions in different ways. For each such compression scheme, several different indexing solutions have been proposed in the last two decades. To date, the fastest indexes for repetitive texts are based on the ru...
Indexing Compressed Text
We present a technique to build an index based on suffix arrays for compressed texts. We also propose a compression scheme for textual databases based on words that generates a compression code that preserves the lexicographical ordering of the text words. As a consequence it permits the sorting of the compressed strings to generate the suffix array without decompressing. As the compressed text is ...
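The order-preserving idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's scheme: as an assumption, each word is simply replaced by its rank in the sorted vocabulary, which preserves lexicographic order but does not by itself compress anything. The function names (`order_preserving_codes`, `encode`, `word_suffix_array`) are hypothetical.

```python
from typing import Dict, List


def order_preserving_codes(words: List[str]) -> Dict[str, int]:
    """Assign each distinct word its rank in the sorted vocabulary.

    Assumption for illustration: integer ranks stand in for the
    order-preserving compression code described in the abstract.
    """
    vocabulary = sorted(set(words))
    return {word: rank for rank, word in enumerate(vocabulary)}


def encode(words: List[str], codes: Dict[str, int]) -> List[int]:
    """Replace every word of the text by its code."""
    return [codes[word] for word in words]


def word_suffix_array(encoded: List[int]) -> List[int]:
    """Sort word-level suffix start positions by comparing codes only.

    The original words are never consulted here, which is the point:
    the order of the codes mirrors the order of the words.
    """
    return sorted(range(len(encoded)), key=lambda start: encoded[start:])
```

Because a rank compares the same way as the word it stands for, sorting suffixes of the encoded sequence should give the same order as sorting word-level suffixes of the original text. The slicing-based sort is quadratic in the worst case and is meant only to keep the sketch short.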
Compressed Text Indexing with Wildcards
Let $T = T_1 \phi^{k_1} T_2 \phi^{k_2} \cdots \phi^{k_d} T_{d+1}$ be a text of total length $n$, where characters of each $T_i$ are chosen from an alphabet $\Sigma$ of size $\sigma$, and $\phi$ denotes a wildcard symbol. The text indexing with wildcards problem is to index $T$ such that when we are given a query pattern $P$, we can locate the occurrences of $P$ in $T$ efficiently. This problem has been applied in indexing genomic sequences that contain singl...
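To make the problem statement concrete, here is a small Python baseline that reports where a pattern occurs in a text containing wildcards. It is a brute-force scan, not an index and not the paper's method; it only pins down what counts as an occurrence. The wildcard is written as the character `φ`, and the name `wildcard_occurrences` is hypothetical.

```python
from typing import List

WILDCARD = "φ"  # assumed encoding of the wildcard symbol inside the text


def wildcard_occurrences(text: str, pattern: str) -> List[int]:
    """Return every start position where `pattern` matches `text`.

    A wildcard position in the text matches any single pattern
    character; all other positions must match exactly.
    """
    m = len(pattern)
    positions = []
    for start in range(len(text) - m + 1):
        window = text[start:start + m]
        if all(t == WILDCARD or t == p for t, p in zip(window, pattern)):
            positions.append(start)
    return positions
```

An index for this problem has to return the same positions as this scan while avoiding the cost of checking every alignment, which grows with the product of text and pattern length.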
On Entropy-Compressed Text Indexing in External Memory
A new trend in the field of pattern matching is to design indexing data structures which take space very close to that required by the indexed text (in entropy-compressed form) and also simultaneously achieve good query performance. Two popular indexes, namely the FM-index [Ferragina and Manzini, 2005] and the CSA [Grossi and Vitter, 2005], achieve this goal by exploiting the Burrows-Wheeler tra...
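As a rough illustration of how an FM-index-style structure counts pattern occurrences, the following Python sketch builds a Burrows-Wheeler transform by sorting rotations and runs backward search over it. It is a toy under stated assumptions: the terminator `$` is assumed absent from the text and smaller than every text character, rank queries rescan the string instead of using a compressed rank structure, and the names `bwt` and `count_occurrences` are hypothetical.

```python
from typing import Dict


def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations of text + '$'.

    Assumes '$' does not occur in `text` and sorts before all of its
    characters (true for ASCII letters and digits).
    """
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_str: str, pattern: str) -> int:
    """Count occurrences of `pattern` using backward search on the BWT."""
    # first[c] = number of characters in the text strictly smaller than c
    counts: Dict[str, int] = {}
    for ch in bwt_str:
        counts[ch] = counts.get(ch, 0) + 1
    first: Dict[str, int] = {}
    total = 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]

    def rank(ch: str, i: int) -> int:
        # Occurrences of ch in bwt_str[:i]; a real index answers this
        # in (near) constant time instead of rescanning.
        return bwt_str[:i].count(ch)

    lo, hi = 0, len(bwt_str)  # half-open range of matching rotations
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + rank(ch, lo)
        hi = first[ch] + rank(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo
```

For example, `count_occurrences(bwt("abracadabra"), "abra")` walks the pattern from its last character to its first, narrowing the range of sorted rotations that start with the suffix processed so far. The space and time bounds discussed in the abstract depend on compressed representations that this sketch deliberately leaves out.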
Journal
Journal title: Journal of the ACM
Year: 2005
ISSN: 0004-5411,1557-735X
DOI: 10.1145/1082036.1082039